Abstract

In the present paper we consider the problem of estimating a periodic $(r+1)$-dimensional
function $f$ based on observations from its noisy convolution. We construct
a wavelet estimator of $f$, derive minimax lower bounds for the $L^2$-risk when $f$ belongs
to a Besov ball of mixed smoothness and demonstrate that the wavelet estimator is
adaptive and asymptotically near-optimal, within a logarithmic factor, in a wide range
of Besov balls. We prove in particular that choosing this type of mixed smoothness
leads to rates of convergence which are free of the "curse of dimensionality" and,
hence, are higher than usual convergence rates when $r$ is large.

The problem studied in the paper is motivated by seismic inversion, which can be
reduced to the solution of noisy two-dimensional convolution equations that allow one to draw
inferences on underground layer structures along the chosen profiles. The common
practice in seismology is to recover layer structures separately for each profile and
then to combine the derived estimates into a two-dimensional function. By studying
the two-dimensional version of the model, we demonstrate that this strategy usually
leads to estimators which are less accurate than the ones obtained as two-dimensional
functional deconvolutions. Indeed, we show that unless the function $f$ is very smooth
in the direction of the profiles, very spatially inhomogeneous along the other direction
and the number of profiles is very limited, the functional deconvolution solution
has a much better precision compared to a combination of $M$ solutions of separate
convolution equations.
1 Introduction.

Consider the problem of estimating a periodic $(r+1)$-dimensional function $f(u,x)$ with
$u = (u_1,\ldots,u_r) \in [0,1]^r$, $x \in [0,1]$, based on observations from the following noisy
convolution

\[ y(u,t) = \int_0^1 g(u,t-x) f(u,x)\,dx + \varepsilon z(u,t), \quad u \in [0,1]^r,\ t \in [0,1]. \tag{1.1} \]

Here, the function $g(\cdot,\cdot)$ is assumed to be known and $z(u,t)$ is an $(r+1)$-dimensional Gaussian
white noise, i.e., a generalized $(r+1)$-dimensional Gaussian field with covariance function

\[ E[z(u_1,t_1) z(u_2,t_2)] = \delta(t_1-t_2) \prod_{l=1}^{r} \delta(u_{1l}-u_{2l}), \]

where $\delta(\cdot)$ denotes the Dirac $\delta$-function. Denote

\[ h(u,t) = \int_0^1 g(u,t-x) f(u,x)\,dx. \]

Then, equation (1.1) can be rewritten as

\[ y(u,t) = h(u,t) + \varepsilon z(u,t). \tag{1.2} \]

In order to simplify the narrative, we start with the two-dimensional version of
equation (1.1):

\[ y(u,t) = \int_0^1 g(u,t-x) f(u,x)\,dx + \varepsilon z(u,t), \quad u,t \in [0,1]. \tag{1.3} \]

The sampling version of problem (1.3) appears as

\[ y(u_l,t_i) = \int_0^1 g(u_l,t_i-x) f(u_l,x)\,dx + \sigma \xi_{li}, \quad l = 1,\ldots,M,\ i = 1,\ldots,N, \tag{1.4} \]

where $u_l = l/M$, $t_i = i/N$ and $\xi_{li}$ are i.i.d. normal variables with $E(\xi_{li}) = 0$ and

\[ E(\xi_{l_1 i_1}\xi_{l_2 i_2}) = \delta(l_1-l_2)\,\delta(i_1-i_2). \]

Equation (1.4) seems to be equivalent to $M$ separate convolution equations

\[ y_l(t_i) = \int_0^1 f_l(x) g_l(t_i-x)\,dx + \sigma \xi_{li}, \quad l = 1,\ldots,M,\ i = 1,\ldots,N, \tag{1.5} \]

with $y_l(t_i) = y(u_l,t_i)$, $f_l(x) = f(u_l,x)$ and $g_l(t_i-x) = g(u_l,t_i-x)$. This is, however, not
true, since the solution of equation (1.4) is a two-dimensional function while solutions
of equations (1.5) are $M$ unrelated functions $f_l(t)$. In this sense, problem (1.3) and its
sampling equivalent (1.4) are functional deconvolution problems.

Functional deconvolution problems have been introduced in Pensky and Sapatinas 
(2009) and further developed in Pensky and Sapatinas (2010, 2011). However, Pensky and 
Sapatinas (2009, 2010, 2011) considered a different version of the problem where f(u,t) 
was a function of one variable, i.e. f(u,t) = f(t). Their interpretation of functional 
deconvolution problem was motivated by solution of inverse problems in mathematical 
physics and multichannel deconvolution in engineering practices. Functional deconvolution
problems of types (1.3) and (1.4) are motivated by experiments where one needs to recover
a two-dimensional function using observations of its convolutions along profiles $u = u_l$.
This situation occurs, for example, in geophysical explorations, in particular, the ones
which rely on inversions of seismic signals (see, e.g., monographs of Robinson et al. (1996)
and Robinson (1999) and, e.g., papers of Wason et al. (1984), Berkhout (1986) and Heimer
and Cohen (2008)).

In seismic exploration, a short duration seismic pulse is transmitted from the surface, 
reflected from boundaries between underground layers, and received by an array of sensors
on the Earth's surface. The signals are transmitted along straight lines called profiles.
The received signals, called seismic traces, are analyzed to extract information about the 
underground structure of the layers along the profile. Subsequently, these traces can be 
modeled under simplifying assumptions as noisy outcomes of convolutions between reflectivity
sequences, which describe the configuration of the layers, and the short wave-like function
(called a wavelet in geophysics) which corresponds to the convolution kernel. The objective of
seismic deconvolution is to estimate the reflectivity sequences from the measured traces.
In the simple case of one layer and a single profile, the boundary will be described by a
univariate function which is the solution of the convolution equation. The next step is
usually to combine the recovered functions which are defined on the set of parallel planes 
passing through the profiles into a multivariate function which provides the exhaustive 
picture of the structure of the underground layers. This is usually accomplished by in- 
terpolation techniques. However, since the layers are intrinsically anisotropic (may have 
different structures in various directions) and spatially inhomogeneous (may experience, 
for example, sharp breaks), the former approach ignores the anisotropic and spatially inho- 
mogeneous nature of the two-dimensional function describing the layer and loses precision 
by analyzing each profile separately. 

The paper carries out the following program: 

i) Construction of a feasible procedure $\hat f(u,t)$ for estimating the $(r+1)$-dimensional
function $f(u,t)$ which achieves optimal rates of convergence (up to inessential logarithmic
terms). We require $\hat f(u,t)$ to be adaptive with respect to smoothness constraints
on $f$. In this sense, the paper is related to a multitude of papers which offered
wavelet solutions to deconvolution problems (see, e.g., Donoho (1995), Abramovich 
and Silverman (1998), Pensky and Vidakovic (1999), Walter and Shen (1999), Fan 
and Koo (2002), Kalifa and Mallat (2003), Johnstone, Kerkyacharian, Picard and 
Raimondo (2004), Donoho and Raimondo (2004), Johnstone and Raimondo (2004), 
Neelamani, Choi and Baraniuk (2004) and Kerkyacharian, Picard and Raimondo 
(2007)). 

ii) Identification of the best achievable accuracy under smoothness constraints on $f$.
We focus here on obtaining fast rates of convergence. In this context, we prove that
considering multivariate functions with 'mixed' smoothness and hyperbolic wavelet
bases allows one to obtain rates which are free of dimension and, as a consequence,
faster than the usual ones. In particular, the present paper is related to anisotropic
de-noising explored by, e.g., Kerkyacharian, Lepski and Picard (2001, 2008). We
compare our functional classes as well as our rates with the results obtained there.



iii) Comparison of the two-dimensional version of the functional deconvolution procedure 
studied in the present paper to the separate solutions of convolution equations. We 
show especially that the former approach delivers estimators with higher precision. 
For this purpose, in Section 5 we consider a discrete version of the functional deconvolution
problem (1.4) (rather than the continuous equation (1.3)) and compare its
solution with solutions of $M$ separate convolution equations (1.5). We show that,
unless the function $f$ is very smooth in the direction of the profiles, very spatially
inhomogeneous along the other direction and the number of profiles is very limited,
the functional deconvolution solution has a better precision than the combination of $M$
solutions of separate convolution equations.

The rest of the paper is organized as follows. In order to make the paper more
readable and due to the application to seismic inversion, we start, in Section 2, with
the two-dimensional version of the functional deconvolution problem (1.3) and describe the
construction of a two-dimensional wavelet estimator of $f(u,t)$ given by equation (1.3). In
Section 3, we give a brief introduction to spaces of anisotropic smoothness. After that, we
derive minimax lower bounds for the $L^2$-risk, based on observations from (1.3), under the
condition that $f(u,t)$ belongs to a Besov ball of mixed regularity and $g(u,x)$ has certain
smoothness properties. In Section 4, we prove that the hyperbolic wavelet estimator
derived in Section 2 is adaptive and asymptotically near-optimal within a logarithmic
factor (in the minimax sense) in a wide range of Besov balls. Section 5 is devoted to the
discrete version of the problem (1.4) and comparison of the functional deconvolution solution
with the collection of individual deconvolution equations. Section 6 extends the results to
the $(r+1)$-dimensional version of the problem (1.1). We conclude the paper with a discussion
of the results in Section 7. Finally, Section 8 contains the proofs of the theoretical results
obtained in the earlier sections.

2 Estimation Algorithm. 

In what follows, $\langle\cdot,\cdot\rangle$ denotes the inner product in the Hilbert space $L^2([0,1])$ (the space of
square-integrable functions defined on the unit interval $[0,1]$), i.e., $\langle f,g\rangle = \int_0^1 f(t)\overline{g(t)}\,dt$
for $f,g \in L^2([0,1])$. We also denote the complex conjugate of $a$ by $\bar a$. Let $e_m(t) = e^{i2\pi mt}$
be a Fourier basis on the interval $[0,1]$. Let $h_m(u) = \langle e_m,h(u,\cdot)\rangle$, $y_m(u) = \langle e_m,y(u,\cdot)\rangle$,
$z_m(u) = \langle e_m,z(u,\cdot)\rangle$, $g_m(u) = \langle e_m,g(u,\cdot)\rangle$ and $f_m(u) = \langle e_m,f(u,\cdot)\rangle$ be functional Fourier
coefficients of functions $h$, $y$, $z$, $g$ and $f$, respectively. Then, applying the Fourier transform
to equation (1.2), one obtains for any $u \in [0,1]$

\[ y_m(u) = g_m(u) f_m(u) + \varepsilon z_m(u) \quad\text{and}\quad h_m(u) = g_m(u) f_m(u). \tag{2.1} \]
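The passage from the convolution (1.2) to the product form (2.1) can be checked numerically on a grid. The following is a minimal sketch, not the authors' code: it assumes a Riemann-sum discretization of the integral, a normalized discrete Fourier transform with the kernel $e^{-i2\pi mt}$ (a sign convention chosen here for illustration), and a hypothetical signal `f` and kernel `g`.

```python
import cmath
import math

def dft(x):
    """Normalized DFT: X[m] = (1/n) * sum_i x[i] * exp(-2*pi*1j*m*i/n)."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * m * i / n) for i in range(n)) / n
            for m in range(n)]

def circular_convolution(g, f):
    """Riemann sum for h(t_i) = int_0^1 g(t_i - x) f(x) dx with periodic g and f."""
    n = len(f)
    return [sum(g[(i - k) % n] * f[k] for k in range(n)) / n for i in range(n)]

n = 16
grid = [i / n for i in range(n)]
f = [math.cos(2 * math.pi * t) + 0.5 for t in grid]    # hypothetical signal
g = [math.exp(-5.0 * min(t, 1.0 - t)) for t in grid]   # hypothetical periodic kernel
h = circular_convolution(g, f)
f_m, g_m, h_m = dft(f), dft(g), dft(h)

# Largest deviation between h_m and g_m * f_m over all frequencies m.
max_gap = max(abs(h_m[m] - g_m[m] * f_m[m]) for m in range(n))
```

On the grid the identity $h_m = g_m f_m$ holds up to floating-point error, which is what makes division by $g_m(u)$ the natural inversion step in the construction that follows.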
Consider a bounded bandwidth periodized wavelet basis (e.g., Meyer-type) $\psi_{j,k}(t)$
and a finitely supported periodized $s_0$-regular wavelet basis (e.g., Daubechies) $\eta_{j',k'}(u)$. Let
$m_0$ and $m_0'$ be the lowest resolution levels for the two bases and denote the scaling functions
for the bounded bandwidth wavelet by $\psi_{m_0-1,k}(t)$ and the scaling functions for the finitely
supported wavelet by $\eta_{m_0'-1,k'}(u)$. Then, $f(u,x)$ can be expanded into a wavelet series as

\[ f(u,x) = \sum_{j=m_0-1}^{\infty} \sum_{j'=m_0'-1}^{\infty} \sum_{k=0}^{2^j-1} \sum_{k'=0}^{2^{j'}-1} \beta_{j,k,j',k'}\, \psi_{j,k}(x)\, \eta_{j',k'}(u). \tag{2.2} \]

Denote $\beta_{j,k}(u) = \langle f(u,\cdot),\psi_{j,k}\rangle$; then $\beta_{j,k,j',k'} = \langle \beta_{j,k}(u),\eta_{j',k'}(u)\rangle$. If $\psi_{j,k,m} = \langle e_m,\psi_{j,k}\rangle$ are
Fourier coefficients of $\psi_{j,k}$, then, by formula (2.1) and Plancherel's formula, one has

\[ \beta_{j,k}(u) = \sum_{m\in W_j} f_m(u)\,\overline{\psi_{j,k,m}} = \sum_{m\in W_j} \frac{h_m(u)}{g_m(u)}\,\overline{\psi_{j,k,m}}, \tag{2.3} \]

where, for any $j \ge j_0$,

\[ W_j = \{m : \psi_{j,k,m} \ne 0\} \subseteq \frac{2\pi}{3}\Big([-2^{j+2},-2^j] \cup [2^j,2^{j+2}]\Big), \tag{2.4} \]

due to the fact that Meyer wavelets are band-limited (see, e.g., Johnstone, Kerkyacharian,
Picard & Raimondo (2004), Section 3.1). Therefore, $\beta_{j,k,j',k'}$ are of the form

\[ \beta_{j,k,j',k'} = \sum_{m\in W_j} \overline{\psi_{j,k,m}} \int_0^1 \frac{h_m(u)}{g_m(u)}\, \eta_{j',k'}(u)\,du, \tag{2.5} \]

and allow the unbiased estimator

\[ \tilde\beta_{j,k,j',k'} = \sum_{m\in W_j} \overline{\psi_{j,k,m}} \int_0^1 \frac{y_m(u)}{g_m(u)}\, \eta_{j',k'}(u)\,du. \tag{2.6} \]

We now construct a hard thresholding estimator of $f(u,t)$ as

\[ \hat f(u,t) = \sum_{j=m_0-1}^{J-1} \sum_{j'=m_0'-1}^{J'-1} \sum_{k=0}^{2^j-1} \sum_{k'=0}^{2^{j'}-1} \hat\beta_{j,k,j',k'}\, \psi_{j,k}(t)\, \eta_{j',k'}(u), \tag{2.7} \]

where

\[ \hat\beta_{j,k,j',k'} = \tilde\beta_{j,k,j',k'}\, I\big(|\tilde\beta_{j,k,j',k'}| > \lambda_{j\varepsilon}\big), \tag{2.8} \]

and the values of $J$, $J'$ and $\lambda_{j\varepsilon}$ will be defined later.

In what follows, we use the symbol $C$ for a generic positive constant, independent
of $\varepsilon$, which may take different values at different places.
3 Smoothness classes and minimax lower bounds



3.1 Smoothness classes 

It is natural to consider anisotropic multivariate functions, i.e., functions whose smooth- 
ness is different in different directions. It is, however, much more difficult to construct 
appropriate spaces of mixed regularity which are meaningful for applications. One of the 
objectives of the present paper is to prove that classes of mixed regularity allow one to obtain
rates of convergence which are free of dimension. This is specifically due to the application
of hyperbolic wavelets, i.e., wavelets which allow different resolution levels for each
direction (see, e.g., Heping (2004)).

Although a comprehensive study of functional classes of mixed regularity is not the
purpose of this paper, below we provide a short introduction to the functional classes that
we are going to consider. Due to the relation of this paper to anisotropic de-noising explored
by Kerkyacharian, Lepski and Picard (2001, 2008), we also compare the classes of mixed
regularity used here to the Nikolski classes considered in the papers cited above.

First, let us recall the definition of the Nikolski classes $B^{(s_1,\ldots,s_d)}_{(p_1,\ldots,p_d),\infty}$ (see Nikolskii (1975)).
In this section we consider $d$-dimensional multivariate functions. In what follows, we set
$d = r+1$ or $d = 2$.

Let $f$ be a measurable function defined on $\mathbb{R}^d$. For any $x, y \in \mathbb{R}^d$ we define

\[ \Delta_y f(x) = f(x+y) - f(x). \]

If $l \in \mathbb{N}$, then $\Delta_y^l$ is the $l$-iterated version of the operator $\Delta_y$. (Of course, $\Delta_y^0 = \mathrm{Id}$.) Then,
Nikolski classes can be defined as follows:

1. Let $e_1,\ldots,e_d$ be the canonical basis of $\mathbb{R}^d$. For $0 < s_i < \infty$, $1 \le p_i \le \infty$, we say that
$f$ belongs to $B^{s_i}_{p_i,\infty}$ if and only if there exist $l \in \mathbb{N}$, $s_i < l$, and $C(s_i,l) < \infty$, such
that for any $h \in \mathbb{R}$ one has

\[ \|\Delta^l_{h e_i} f\|_{L^{p_i}(\mathbb{R}^d,dx)} \le C(s_i,l)\,|h|^{s_i}. \]

2. $B^{(s_1,\ldots,s_d)}_{(p_1,\ldots,p_d),\infty} = \bigcap_{i=1}^{d} B^{s_i}_{p_i,\infty}$.

The Nikolski classes defined above were investigated by Kerkyacharian, Lepski and
Picard (2001, 2008); they are anisotropic but do not involve mixed smoothness. Quite
differently, in the present paper we shall consider classes of mixed regularity defined as
follows. Denote $h = (h_1,\ldots,h_d)$, $t = (t_1,\ldots,t_d)$, $s = (s_1,\ldots,s_d)$ and let $t_i > 0$, $s_i > 0$,
$i = 1,\ldots,d$. For a subset $e \subset \{1,\ldots,d\}$, we set $h^e$ to be the vector with coordinates $h_i$
when $i$ belongs to $e$, and $0$ otherwise. For a fixed integer $l$ and $1 \le p \le \infty$, we denote

\[ \Delta^{l,e}_{h} f(x) := \Big(\prod_{j\in e} \Delta^l_{h_j e_j}\Big) f(x), \qquad \Omega^{l,e}(f,t^e)_p := \sup_{|h_j| \le t_j,\ j \in e} \|\Delta^{l,e}_{h} f\|_p. \]
Now, in order to construct Besov classes of mixed regularity, we choose $l > \max_j s_j$ and
define

\[ B^{s_1,\ldots,s_d}_{p,\infty} = \Big\{ f \in L^p : \sum_{e \subset \{1,\ldots,d\}} \sup_{t>0} \prod_{j\in e} t_j^{-s_j}\, \Omega^{l,e}(f,t^e)_p < \infty \Big\}. \tag{3.1} \]

It is proved in, e.g., Heping (2004) that, under appropriate (regularity) conditions which
we are omitting here, classes (3.1) can be expressed in terms of hyperbolic-wavelet coefficients,
thus providing a convenient generalization of the one-dimensional Besov $B^{s}_{p,\infty}$
spaces. Furthermore, Heping (2004) considers more general Besov classes of mixed regularity
$B^{s_1,\ldots,s_d}_{p,q}$ that correspond to $q < \infty$ rather than $q = \infty$. In this paper, we shall
assume that the hyperbolic wavelet basis satisfies the required regularity conditions and follow
Heping's (2004) definition of Besov spaces of mixed regularity

\[ B^{s_1,\ldots,s_d}_{p,q} = \Big\{ f \in L^2(U) : \Big( \sum_{j_1,\ldots,j_d} 2^{q \sum_{i=1}^{d} j_i [s_i + 1/2 - 1/p]} \Big( \sum_{k_1,\ldots,k_d} |\beta_{j_1,k_1,\ldots,j_d,k_d}|^p \Big)^{q/p} \Big)^{1/q} < \infty \Big\}. \tag{3.2} \]

Besov classes (3.2) compare quite easily to the Nikolski classes: it is easy to prove that
the former form a subset of the latter.



3.2 Lower bounds for the risk: two-dimensional case

Denote $U = [0,1] \times [0,1]$ and

\[ s_i^* = s_i + 1/2 - 1/p, \quad s_i' = s_i + 1/2 - 1/p', \quad i = 1,2, \quad p' = \min\{p,2\}. \tag{3.3} \]

In what follows, we assume that the function $f(u,t)$ belongs to a two-dimensional Besov
ball as described above ($d = 2$), so that wavelet coefficients $\beta_{j,k,j',k'}$ satisfy the following
condition

\[ B^{s_1,s_2}_{p,q}(A) = \Big\{ f \in L^2(U) : \Big( \sum_{j,j'} 2^{(j s_1^* + j' s_2^*) q} \Big( \sum_{k,k'} |\beta_{j,k,j',k'}|^p \Big)^{q/p} \Big)^{1/q} \le A \Big\}. \tag{3.4} \]

Below, we construct minimax lower bounds for the $L^2$-risk. For this purpose, we define
the minimax $L^2$-risk over the set $V$ as

\[ R_\varepsilon(V) = \inf_{\tilde f} \sup_{f \in V} E\|\tilde f - f\|^2, \]

where $\|g\|$ is the $L^2$-norm of a function $g(\cdot)$ and the infimum is taken over all possible
estimators $\tilde f(\cdot)$ (measurable functions taking their values in a set containing $V$) of $f(\cdot)$.

Assume that functional Fourier coefficients $g_m(u)$ of function $g(u,t)$ are uniformly
bounded from above and below, that is, there exist positive constants $\nu$, $C_1$ and $C_2$,
independent of $m$ and $u$, such that

\[ C_1 |m|^{-2\nu} \le |g_m(u)|^2 \le C_2 |m|^{-2\nu}. \tag{3.5} \]
Then, the following theorem gives the minimax lower bounds for the $L^2$-risk of any estimator
$\tilde f_n$ of $f$.

Theorem 1 Let $\min\{s_1,s_2\} \ge \max\{1/p,1/2\}$ with $1 \le p,q \le \infty$, let $A > 0$ and $s_i^*$,
$i = 1,2$, be defined in (3.3). Then, under assumption (3.5), as $\varepsilon \to 0$,

\[ R_\varepsilon(B^{s_1,s_2}_{p,q}(A)) \ge C A^2 \Big(\frac{\varepsilon^2}{A^2}\Big)^{d}, \tag{3.6} \]

where

\[ d = \min\Big( \frac{2s_2}{2s_2+1},\ \frac{2s_1}{2s_1+2\nu+1},\ \frac{2s_1'}{2s_1'+2\nu} \Big). \tag{3.7} \]

Note that the value of $d$ in (3.7) can be re-written as

\[ d = \begin{cases} \dfrac{2s_2}{2s_2+1}, & \text{if } s_1 > s_2(2\nu+1), \\[2mm] \dfrac{2s_1}{2s_1+2\nu+1}, & \text{if } (1/p-1/2)(2\nu+1) \le s_1 \le s_2(2\nu+1), \\[2mm] \dfrac{2s_1'}{2s_1'+2\nu}, & \text{if } s_1 < (1/p-1/2)(2\nu+1). \end{cases} \tag{3.8} \]
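The equivalence between the minimum in (3.7) and the case-by-case form (3.8) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the agreement of the two forms relies on the assumption $\min\{s_1,s_2\} \ge \max\{1/p,1/2\}$ of Theorem 1.

```python
def rate_exponent(s1, s2, nu, p):
    """Exponent d as the minimum of the three candidate exponents in (3.7)."""
    s1_prime = s1 + 0.5 - 1.0 / min(p, 2.0)
    return min(2 * s2 / (2 * s2 + 1),
               2 * s1 / (2 * s1 + 2 * nu + 1),
               2 * s1_prime / (2 * s1_prime + 2 * nu))

def rate_exponent_by_cases(s1, s2, nu, p):
    """Same exponent written case by case, following (3.8)."""
    s1_prime = s1 + 0.5 - 1.0 / min(p, 2.0)
    if s1 > s2 * (2 * nu + 1):
        return 2 * s2 / (2 * s2 + 1)                # smoothness in u dominates
    if s1 >= (1.0 / p - 0.5) * (2 * nu + 1):
        return 2 * s1 / (2 * s1 + 2 * nu + 1)       # dense case in t
    return 2 * s1_prime / (2 * s1_prime + 2 * nu)   # sparse case in t

# Three parameter choices (s1, s2, nu, p), one for each regime of (3.8).
examples = [(4.0, 1.0, 1.0, 2.0), (2.0, 1.0, 1.0, 2.0), (1.0, 1.0, 3.0, 1.0)]
values = [rate_exponent(*e) for e in examples]
values_by_cases = [rate_exponent_by_cases(*e) for e in examples]
```

For these inputs the exponents are $2/3$, $4/7$ and $1/7$, respectively, under both formulations.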



4 Minimax upper bounds. 

Before deriving expressions for the minimax upper bounds for the risk, we formulate
several useful lemmas which give some insight into the choice of the thresholds $\lambda_{j\varepsilon}$ and
upper limits $J$ and $J'$ in the sums in (2.7).

Lemma 1 Let $\tilde\beta_{j,k,j',k'}$ be defined in (2.6). Then, under assumption (3.5), one has

\[ \mathrm{Var}(\tilde\beta_{j,k,j',k'}) \asymp \varepsilon^2 2^{2j\nu}. \tag{4.1} \]

Lemma 1 suggests that thresholds $\lambda_{j\varepsilon}$ should be chosen as

\[ \lambda_{j\varepsilon} = C_\beta \sqrt{\ln(1/\varepsilon)}\, 2^{j\nu} \varepsilon, \tag{4.2} \]

where $C_\beta$ is some positive constant independent of $\varepsilon$. We choose $J$ and $J'$ as

\[ 2^{J} = (\varepsilon^2)^{-\frac{1}{2\nu+1}}, \qquad 2^{J'} = (\varepsilon^2)^{-1}. \tag{4.3} \]

Note that the choices of $J$, $J'$ and $\lambda_{j\varepsilon}$ are independent of the parameters $s_1$, $s_2$, $p$, $q$ and
$A$ of the Besov ball $B^{s_1,s_2}_{p,q}(A)$, and therefore our estimator (2.7) is adaptive with respect
to those parameters.
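The tuning quantities in (4.2) and (4.3) depend only on $\varepsilon$, $\nu$ and the constant $C_\beta$, which can be made concrete in a few lines. This is a minimal sketch with hypothetical function names; rounding $J$ and $J'$ down to integers is an assumption made here, since (4.3) defines them only through powers of two.

```python
import math

def resolution_levels(eps, nu):
    """Integers J, J' with 2^J <= (eps^2)^(-1/(2*nu+1)) and 2^J' <= (eps^2)^(-1).

    Rounding down is an assumption of this sketch, not part of (4.3).
    """
    log_inv = math.log2(eps ** -2)
    return int(math.floor(log_inv / (2 * nu + 1))), int(math.floor(log_inv))

def threshold(j, eps, nu, c_beta):
    """Level-dependent threshold lambda_{j,eps} of (4.2)."""
    return c_beta * math.sqrt(math.log(1.0 / eps)) * 2 ** (j * nu) * eps

def hard_threshold(coeff, lam):
    """Keep a coefficient only if its magnitude exceeds the threshold, as in (2.8)."""
    return coeff if abs(coeff) > lam else 0.0

eps = 2.0 ** -10
J, J_prime = resolution_levels(eps, nu=1.5)
lam_2 = threshold(2, eps, nu=1.5, c_beta=1.0)
kept = hard_threshold(0.5, 0.3)
killed = hard_threshold(-0.2, 0.3)
```

None of these functions takes $s_1$, $s_2$, $p$, $q$ or $A$ as input, which is the sense in which the estimator is adaptive.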

The next two lemmas provide upper bounds for the wavelet coefficients and the large 
deviation inequalities for their estimators. 

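Lemma 2 Under assumption (3.4), one has

\[ \sum_{k=0}^{2^j-1} \sum_{k'=0}^{2^{j'}-1} |\beta_{j,k,j',k'}|^2 \le A^2 2^{-2(j s_1' + j' s_2')} \]

for any $j,j' \ge 0$.

Lemma 3 Let $\tilde\beta_{j,k,j',k'}$ and $\lambda_{j\varepsilon}$ be defined by formulae (2.6) and (4.2), respectively. Define,
for some positive constant $\alpha$, the set

\[ \Theta_{jk,j'k',\alpha} = \big\{ \Theta : |\tilde\beta_{j,k,j',k'} - \beta_{j,k,j',k'}| > \alpha\lambda_{j\varepsilon} \big\}. \tag{4.4} \]

Then, under assumption (3.5), as $\varepsilon \to 0$, one has

\[ \Pr(\Theta_{jk,j'k',\alpha}) = O\Big( \varepsilon^{\frac{\alpha^2 C_\beta^2}{2\sigma_0^2}} [\ln(1/\varepsilon)]^{-\frac12} \Big), \tag{4.5} \]

where $\sigma_0^2 = (8\pi/3)^{2\nu} C_1^{-1}$ and $C_1$ is defined in (3.5).

Using the statements above, we can derive upper bounds for the minimax risk of
the estimator (2.7).

Theorem 2 Let $\hat f(\cdot,\cdot)$ be the wavelet estimator defined in (2.7) with $J$ and $J'$ given by
(4.3). Let condition (3.5) hold and $\min\{s_1,s_2\} \ge \max\{1/p,1/2\}$, with $1 \le p,q \le \infty$. If
$C_\beta$ in (4.2) is such that

\[ C_\beta^2 \ge 80\,(C_1)^{-1} (2\pi/3)^{2\nu}, \tag{4.6} \]

where $C_1$ is defined in (3.5), then, as $\varepsilon \to 0$,

\[ \sup_{f \in B^{s_1,s_2}_{p,q}(A)} E\|\hat f - f\|^2 \le C A^2 \Big( \frac{\varepsilon^2 \ln(1/\varepsilon)}{A^2} \Big)^{d} (\ln(1/\varepsilon))^{d_1}, \tag{4.7} \]

where $d$ is defined in (3.7) and

\[ d_1 = I(s_1 = s_2(2\nu+1)) + I(s_1 = (2\nu+1)(1/p-1/2)). \tag{4.8} \]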

5 Sampling version of the equation and comparison with 
separate deconvolution recoveries 

Consider now the sampling version (1.4) of the problem (1.3). In this case, the estimators
of wavelet coefficients $\beta_{j,k,j',k'}$ can be constructed as

\[ \tilde\beta_{j,k,j',k'} = \frac{1}{M} \sum_{m\in W_j} \overline{\psi_{j,k,m}} \sum_{l=1}^{M} \frac{y_m(u_l)}{g_m(u_l)}\, \eta_{j',k'}(u_l). \tag{5.1} \]

In practice, $\tilde\beta_{j,k,j',k'}$ are obtained simply by applying the discrete wavelet transform to vectors
$y_m(\cdot)/g_m(\cdot)$.

Recall that the continuous versions (2.6) of estimators (5.1) have $\mathrm{Var}(\tilde\beta_{j,k,j',k'}) \asymp
\varepsilon^2 2^{2j\nu}$ (see formula (4.1)). In order to show that equation (1.4) is the sampling version of
(1.3) with $\varepsilon^2 = \sigma^2/(MN)$, one needs to show that, in the discrete case, $\mathrm{Var}(\tilde\beta_{j,k,j',k'}) \asymp
\sigma^2 (MN)^{-1} 2^{2j\nu}$. This is indeed accomplished by the following lemma.

Lemma 4 Let $\tilde\beta_{j,k,j',k'}$ be defined in (5.1). Then, under assumption (3.5), as $MN \to \infty$,
one has

\[ \mathrm{Var}(\tilde\beta_{j,k,j',k'}) \asymp \sigma^2 (MN)^{-1} 2^{2j\nu}. \tag{5.2} \]

Using tools developed in Pensky and Sapatinas (2009) and Lemma 4, it is easy to
formulate the lower and the upper bounds for convergence rates of the estimator (2.7) with
$\hat\beta_{j,k,j',k'}$ given by (2.8) and the values of $\lambda_{j\varepsilon}$ and $J$, $J'$ defined in (4.2) and (4.3), respectively.
In particular, we obtain the following statement.

Theorem 3 Let $\min\{s_1,s_2\} \ge \max\{1/p,1/2\}$ with $1 \le p,q \le \infty$, let $A > 0$ and $s_i^*$ be
defined in (3.3). Then, under assumption (3.5), as $MN \to \infty$, for some absolute constant
$C > 0$ one has

\[ R_{(MN)}(B^{s_1,s_2}_{p,q}(A)) \ge C \big(\sigma^2 (MN)^{-1}\big)^{d}. \tag{5.3} \]

Moreover, if $\hat f(\cdot,\cdot)$ is the wavelet estimator defined in (2.7), $\min\{s_1,s_2\} \ge \max\{1/p,1/2\}$,
and $J$ and $J'$ are given by (4.3), then, under assumption (3.5), as $MN \to \infty$,

\[ \sup_{f \in B^{s_1,s_2}_{p,q}(A)} E\|\hat f - f\|^2 \le C \big(\sigma^2 (MN)^{-1} \ln(MN)\big)^{d} (\ln(MN))^{d_1}, \tag{5.4} \]

where $d$ and $d_1$ are defined in (3.7) and (4.8), respectively.



Now, let us compare the rates in Theorem 3 with the rates obtained by recovering
each deconvolution $f_l(t) = f(u_l,t)$, $u_l = l/M$, $l = 1,\ldots,M$, separately, using equations
(1.5). In order to do this, we need to determine in which space functions $f_l(x)$ are contained.
The following lemma provides the necessary conclusion.

Lemma 5 Let $f \in B^{s_1,s_2}_{p,q}(A)$ with $s_1 \ge \max\{1/p,1/2\}$, $s_2 \ge \max\{1/p,1/2\}$ and $1 \le
p,q \le \infty$. Then, for any $l = 1,\ldots,M$, we have

\[ f_l(t) = f(u_l,t) \in B^{s_1}_{p,q}(A). \]

Using Lemma 5 and standard arguments (see, e.g., Johnstone, Kerkyacharian, Picard
and Raimondo (2004)), we obtain for each $f_l$

\[ \sup_{f_l \in B^{s_1}_{p,q}(A)} E\|\hat f_l - f_l\|^2 \asymp \begin{cases} C N^{-\frac{2s_1}{2s_1+2\nu+1}}, & \text{if } s_1 \ge (1/p-1/2)(2\nu+1), \\[2mm] C N^{-\frac{2s_1'}{2s_1'+2\nu}}, & \text{if } s_1 < (1/p-1/2)(2\nu+1). \end{cases} \]

Now, consider the estimator $\hat f$ of $f$ with $\hat f(u_l,t_i) = \hat f_l(t_i)$. Taking into account that

\[ \|\hat f - f\|^2 \approx M^{-1} \sum_{l=1}^{M} \|\hat f_l - f_l\|^2, \]

we derive

\[ E\|\hat f - f\|^2 \asymp \begin{cases} C N^{-\frac{2s_1}{2s_1+2\nu+1}}, & \text{if } s_1 \ge (1/p-1/2)(2\nu+1), \\[2mm] C N^{-\frac{2s_1'}{2s_1'+2\nu}}, & \text{if } s_1 < (1/p-1/2)(2\nu+1). \end{cases} \tag{5.5} \]
By straightforward calculations, one can check that the only case when convergence
rates of separate deconvolution recoveries can possibly be better than that of the simultaneous
estimator is when $s_1 > s_2(2\nu+1)$. In this case, $s_1 > (1/p-1/2)(2\nu+1)$, so that,
comparing the rates, we derive that simultaneous recovery
delivers better precision than separate ones unless

\[ \lim_{M\to\infty,\ N\to\infty} M N^{-\frac{s_1 - s_2(2\nu+1)}{s_2(2s_1+2\nu+1)}} < 1, \qquad s_1 > s_2(2\nu+1). \tag{5.6} \]

It is easy to see that relation (5.6) holds only if $s_1$ is large, $s_2$ is small and $M$ is relatively
small in comparison with $N$.
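The calculation behind condition (5.6) can be sketched as follows. This derivation ignores constants and logarithmic factors (an assumption of the sketch) and compares the simultaneous rate of Theorem 3 in the regime $s_1 > s_2(2\nu+1)$ with the separate-recovery rate (5.5).

```latex
% Simultaneous recovery, regime s_1 > s_2(2\nu+1):  (MN)^{-2 s_2/(2 s_2+1)}.
% Separate recoveries (5.5):                        N^{-2 s_1/(2 s_1+2\nu+1)}.
% Separate recoveries are more precise if and only if
\[
N^{-\frac{2s_1}{2s_1+2\nu+1}} < (MN)^{-\frac{2s_2}{2s_2+1}}
\iff
MN < N^{\frac{s_1(2s_2+1)}{s_2(2s_1+2\nu+1)}}
\iff
M\,N^{-\kappa} < 1,
\]
\[
\kappa = \frac{s_1(2s_2+1)}{s_2(2s_1+2\nu+1)} - 1
       = \frac{s_1 - s_2(2\nu+1)}{s_2(2s_1+2\nu+1)} .
\]
```

Since $\kappa > 0$ exactly when $s_1 > s_2(2\nu+1)$, the condition can hold only if $M$ grows more slowly than $N^{\kappa}$, and $\kappa$ is large only when $s_1$ is large and $s_2$ is small.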

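6 Extension to the (r + 1)-dimensional case

In this section, we extend the results obtained above to the $(r+1)$-dimensional version of
the model (1.1). In this case, expanding both sides of equation (1.1) over the Fourier basis,
as before, we obtain for any $u \in [0,1]^r$

\[ y_m(u) = g_m(u) f_m(u) + \varepsilon z_m(u). \]

Construction of the estimator follows the path of the two-dimensional case. With $\psi_{j,k}(t)$
and $\eta_{j',k'}(u)$ defined earlier, we consider vectors $j' = (j_1',\ldots,j_r')$, $k' = (k_1',\ldots,k_r')$, $m' =
(m_1',\ldots,m_r')$ and $J' = (J_1',\ldots,J_r')$, and subsets $\Upsilon(m',J')$ and $\mathcal{K}(j')$ of the set of
$r$-dimensional vectors with nonnegative integer components:

\[ \Upsilon(m',J') = \{ j' : m_l' \le j_l' \le J_l',\ l = 1,\ldots,r \}, \qquad \mathcal{K}(j') = \{ k' : 0 \le k_l' \le 2^{j_l'} - 1,\ l = 1,\ldots,r \}. \]

If $\infty$ is the $r$-dimensional vector with all components being $\infty$, one can expand $f(u,t)$
into a wavelet series as

\[ f(u,t) = \sum_{j=m_0-1}^{\infty} \sum_{k=0}^{2^j-1} \sum_{j'\in\Upsilon(m',\infty)} \sum_{k'\in\mathcal{K}(j')} \beta_{j,k,j',k'}\, \psi_{j,k}(t) \prod_{l=1}^{r} \eta_{j_l',k_l'}(u_l), \tag{6.1} \]

where coefficients $\beta_{j,k,j',k'}$ are of the form

\[ \beta_{j,k,j',k'} = \sum_{m\in W_j} \overline{\psi_{j,k,m}} \int_{[0,1]^r} \frac{h_m(u)}{g_m(u)} \prod_{l=1}^{r} [\eta_{j_l',k_l'}(u_l)]\,du, \tag{6.2} \]

the set $W_j$ is defined by formula (2.4) and $h_m(u) = \langle (f*g)(\cdot,u), e_m(\cdot)\rangle$. Similarly to the
two-dimensional case, we estimate $f(u,t)$ by

\[ \hat f(u,t) = \sum_{j=m_0-1}^{J-1} \sum_{k=0}^{2^j-1} \sum_{j'\in\Upsilon(m',J')} \sum_{k'\in\mathcal{K}(j')} \hat\beta_{j,k,j',k'}\, \psi_{j,k}(t) \prod_{l=1}^{r} \eta_{j_l',k_l'}(u_l), \tag{6.3} \]

with

\[ \hat\beta_{j,k,j',k'} = \tilde\beta_{j,k,j',k'}\, I\big(|\tilde\beta_{j,k,j',k'}| > \lambda_{j\varepsilon}\big). \tag{6.4} \]

Here

\[ \tilde\beta_{j,k,j',k'} = \sum_{m\in W_j} \overline{\psi_{j,k,m}} \int_{[0,1]^r} \frac{y_m(u)}{g_m(u)} \prod_{l=1}^{r} [\eta_{j_l',k_l'}(u_l)]\,du \tag{6.5} \]

are the unbiased estimators of $\beta_{j,k,j',k'}$, $J$ is defined in (4.3), $J_l'$ are such that $2^{J_l'} = \varepsilon^{-2}$,
$l = 1,\ldots,r$, and $\lambda_{j\varepsilon}$ is given by formula (4.2).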

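Assume, as before, that functional Fourier coefficients $g_m(u)$ of function $g(u,t)$ are
uniformly bounded from above and below,

\[ C_1 |m|^{-2\nu} \le |g_m(u)|^2 \le C_2 |m|^{-2\nu}, \tag{6.6} \]

and that function $f(u,t)$ belongs to an $(r+1)$-dimensional Besov ball. As described in
Section 3.1, to define these Besov balls, we introduce the vector $s_2 = (s_{2,1},\ldots,s_{2,r})$ and
denote by $s_2'$ and $s_2^*$ vectors with components $s_{2,l}' = s_{2,l} + 1/2 - 1/p'$ and $s_{2,l}^* = s_{2,l} + 1/2 - 1/p$,
$l = 1,\ldots,r$, respectively, where $p' = \min\{p,2\}$. If $s_0 > \max_l s_{2,l}$, then the $(r+1)$-dimensional
Besov ball of radius $A$ is characterized by its wavelet coefficients $\beta_{j,k,j',k'}$ as
follows (see, e.g., Heping (2004))

\[ B^{s_1,s_2}_{p,q}(A) = \Big\{ f \in L^2([0,1]^{r+1}) : \Big( \sum_{j,j'} 2^{[j s_1^* + j'^T s_2^*] q} \Big( \sum_{k,k'} |\beta_{j,k,j',k'}|^p \Big)^{q/p} \Big)^{1/q} \le A \Big\}. \tag{6.7} \]

It is easy to show that, with the above assumptions, similarly to the two-dimensional case,
as $\varepsilon \to 0$, one has

\[ \mathrm{Var}(\tilde\beta_{j,k,j',k'}) \asymp \varepsilon^2 2^{2j\nu}, \qquad \sum_{k=0}^{2^j-1} \sum_{k'\in\mathcal{K}(j')} |\beta_{j,k,j',k'}|^2 \le A^2 2^{-2(j s_1' + j'^T s_2')}, \]

\[ \Pr\big( |\tilde\beta_{j,k,j',k'} - \beta_{j,k,j',k'}| > \alpha\lambda_{j\varepsilon} \big) = O\Big( \varepsilon^{\frac{\alpha^2 C_\beta^2}{2\sigma_0^2}} [\ln(1/\varepsilon)]^{-\frac12} \Big). \]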



The upper and the lower bounds for the risk are expressed via

\[ s_{2,0} = \min_{l=1,\ldots,r} s_{2,l} = s_{2,l_0}, \tag{6.8} \]

where $l_0 = \arg\min_l s_{2,l}$. In particular, the following statements hold.

Theorem 4 Let $\min\{s_1,s_{2,l_0}\} \ge \max\{1/p,1/2\}$ with $1 \le p,q \le \infty$. Then, under assumption
(6.6), as $\varepsilon \to 0$,

\[ R_\varepsilon(B^{s_1,s_2}_{p,q}(A)) \ge C A^2 \Big(\frac{\varepsilon^2}{A^2}\Big)^{D}, \tag{6.9} \]

where

\[ D = \min\Big( \frac{2s_{2,0}}{2s_{2,0}+1},\ \frac{2s_1}{2s_1+2\nu+1},\ \frac{2s_1'}{2s_1'+2\nu} \Big), \tag{6.10} \]

or

\[ D = \begin{cases} \dfrac{2s_{2,0}}{2s_{2,0}+1}, & \text{if } s_1 > s_{2,0}(2\nu+1), \\[2mm] \dfrac{2s_1}{2s_1+2\nu+1}, & \text{if } (1/p-1/2)(2\nu+1) \le s_1 \le s_{2,0}(2\nu+1), \\[2mm] \dfrac{2s_1'}{2s_1'+2\nu}, & \text{if } s_1 < (1/p-1/2)(2\nu+1). \end{cases} \tag{6.11} \]

Theorem 5 Let $\hat f(\cdot,\cdot)$ be the wavelet estimator defined in (6.3), with $J$ defined in (4.3),
$J_l'$ such that $2^{J_l'} = (\varepsilon^2)^{-1}$, $l = 1,\ldots,r$, and $\lambda_{j\varepsilon}$ given by formula (4.2). Let condition
(3.5) hold and $\min\{s_1,s_{2,0}\} \ge \max\{1/p,1/2\}$, with $1 \le p,q \le \infty$. If $C_\beta$ in (4.2) satisfies
condition (4.6), then, as $\varepsilon \to 0$,

\[ \sup_{f \in B^{s_1,s_2}_{p,q}(A)} E\|\hat f - f\|^2 \le C A^2 \big( A^{-2} \varepsilon^2 \ln(1/\varepsilon) \big)^{D} [\ln(1/\varepsilon)]^{D_1}, \tag{6.12} \]

where $D$ is defined in (6.10) and

\[ D_1 = I(s_1 = s_{2,0}(2\nu+1)) + I(s_1 = (2\nu+1)(1/p-1/2)) + \sum_{l \ne l_0} I(s_{2,l} = s_{2,l_0}). \tag{6.13} \]

Remark 1 Observe that convergence rates in Theorems 4 and 5 depend on $s_1$, $p$, $\nu$ and
$\min_l s_{2,l}$ but not on the dimension $r$.
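Remark 1 can be illustrated numerically: the exponent in (6.10) sees the vector of smoothness parameters in the $u$-directions only through its minimum. The sketch below uses a hypothetical function name and hypothetical parameter values.

```python
def rate_exponent_multi(s1, s2_vec, nu, p):
    """Exponent D of (6.10); only min(s2_vec) enters, not the dimension r = len(s2_vec)."""
    s20 = min(s2_vec)
    s1_prime = s1 + 0.5 - 1.0 / min(p, 2.0)
    return min(2 * s20 / (2 * s20 + 1),
               2 * s1 / (2 * s1 + 2 * nu + 1),
               2 * s1_prime / (2 * s1_prime + 2 * nu))

# Same minimal smoothness in u, with r = 1 and r = 4.
D_r1 = rate_exponent_multi(2.0, [1.0], 1.0, 2.0)
D_r4 = rate_exponent_multi(2.0, [1.0, 3.0, 5.0, 1.0], 1.0, 2.0)
```

Adding further directions with smoothness at least $s_{2,0}$ leaves the exponent unchanged; by (6.13), only the power of the logarithmic factor can change when several directions attain the minimum.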



7 Discussion. 

i) In the present paper, we constructed functional deconvolution estimators based on 
the hyperbolic wavelet thresholding procedure. We derived the lower and the upper 
bounds for the minimax convergence rates which confirm that estimators derived in 
the paper are adaptive and asymptotically near-optimal, within a logarithmic factor, 
in a wide range of Besov balls of mixed regularity. 

ii) Although results of Kerkyacharian, Lepski and Picard (2001, 2008) have been obtained
in a slightly different framework (no convolution), they can nevertheless be
compared with the results presented above. Set $\nu = 0$ to account for the absence
of convolution, $p_i = p$ and $d = r+1$. Then, convergence rates in the latter can be
identified as rates of a one-dimensional setting with a regularity parameter which is
equal to the harmonic mean

\[ \bar s = \Big( \frac{1}{s_1} + \cdots + \frac{1}{s_d} \Big)^{-1} < \min_{i=1,\ldots,d} s_i. \]

In our case, the rates can also be identified as the rates in the one-dimensional
setting with a regularity parameter $\min_i s_i$ which is always larger than $\bar s$. Moreover,
if $s_i = s$ for all $i$, one obtains $\bar s = s/d < s = \min_i s_i$, showing that estimators of Kerkyacharian,
Lepski and Picard (2001, 2008) in the Nikolski spaces are affected by "the curse
of dimensionality" while the estimators in the anisotropic Besov spaces of mixed
regularity considered in this paper are free of dimension and, therefore, have higher
convergence rates.

iii) The problem studied in the paper is related to seismic inversion, which can be reduced
to the solution of noisy convolution equations which deliver underground layer structures
along the chosen profiles. The common practice in seismology, however, is to recover
layer structures separately for each profile and then to combine them together. This
is usually not the best strategy and leads to estimators which are inferior
to the ones obtained as two-dimensional functional deconvolutions. Indeed, as
shown above, unless the function $f$ is very smooth in the direction of the profiles, very
spatially inhomogeneous along the other dimension and the number of profiles is very
limited, the functional deconvolution solution has precision superior to the combination of
$M$ solutions of separate convolution equations. The precise condition under which separate
recoveries are preferable to the two-dimensional one is given by formula (5.6) which,
essentially, is very reasonable. Indeed, if the number $M$ of profiles is small, there
is no reason to treat $f$ as a two-dimensional function. A small value of $s_2$ indicates
that $f$ is very spatially inhomogeneous and, therefore, the links between its values
on different profiles are very weak. Finally, if $s_1$ is large, deconvolutions are quite
precise, so that the combination of various profiles cannot improve the precision.
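The comparison in item ii) can be made concrete for the de-noising case $\nu = 0$, where a one-dimensional regularity $s$ corresponds to the rate exponent $2s/(2s+1)$. The sketch below uses hypothetical function names and follows the formula for $\bar s$ given in item ii), i.e. the reciprocal of the sum of reciprocals.

```python
def harmonic_regularity(s):
    """Effective regularity (1/s_1 + ... + 1/s_d)^(-1) from item ii)."""
    return 1.0 / sum(1.0 / si for si in s)

def denoising_exponent(s):
    """One-dimensional rate exponent 2s/(2s+1), i.e. the case nu = 0."""
    return 2 * s / (2 * s + 1)

s = [2.0] * 4                                                  # d = 4, equal smoothness
nikolski_exponent = denoising_exponent(harmonic_regularity(s)) # uses s/d = 0.5
mixed_exponent = denoising_exponent(min(s))                    # uses min_i s_i = 2
```

With $d = 4$ and $s_i = 2$ the effective regularity drops to $0.5$ in the Nikolski setting, giving exponent $0.5$, while the mixed-regularity exponent stays at $0.8$ regardless of $d$.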
8 Proofs.

8.1 Proof of the lower bounds for the risk.

In order to prove Theorem 1, we consider two cases, the case when $f(u,t)$ is dense in both
variables (the dense-dense case) and the case when $f(u,t)$ is dense in $u$ and sparse in $t$
(the sparse-dense case). The proof is based on Lemma A.1 of Bunea, Tsybakov and Wegkamp (2007) which we
reformulate here for the case of squared risk.

Lemma 6 [Bunea, Tsybakov, Wegkamp (2007), Lemma A.1] Let $\Omega$ be a set of functions
of cardinality $\mathrm{card}(\Omega) \ge 2$ such that

(i) $\|f - g\|^2 \ge 4\delta^2$ for $f, g \in \Omega$, $f \ne g$,

(ii) the Kullback divergences $K(P_f,P_g)$ between the measures $P_f$ and $P_g$ satisfy the inequality
$K(P_f,P_g) \le \log(\mathrm{card}(\Omega))/16$ for $f, g \in \Omega$.

Then, for some absolute positive constant $C$, one has

\[ \inf_{T_n} \sup_{f\in\Omega} E_f \|T_n - f\|^2 \ge C\delta^2. \]

The dense-dense case. Let $\omega$ be the matrix with components $\omega_{k,k'} \in \{0,1\}$,
$k = 0,\ldots,2^j-1$, $k' = 0,\ldots,2^{j'}-1$. Denote the set of all possible values $\omega$ by $\Omega$ and let
the functions $f_{jj'}$ be of the form

\[ f_{jj'}(t,u) = \gamma_{jj'} \sum_{k=0}^{2^j-1} \sum_{k'=0}^{2^{j'}-1} \omega_{k,k'}\, \psi_{jk}(t)\, \eta_{j'k'}(u). \tag{8.1} \]

Note that the matrix $\omega$ has $N = 2^{j+j'}$ components, and, hence, the cardinality of the set of
such matrices is $\mathrm{card}(\Omega) = 2^N$. Since $f_{jj'} \in B^{s_1,s_2}_{p,q}(A)$, direct calculations show that
$\gamma_{jj'} \le A 2^{-j(s_1+1/2)-j'(s_2+1/2)}$, so that we choose $\gamma_{jj'} = A 2^{-j(s_1+1/2)-j'(s_2+1/2)}$. If $\tilde f_{jj'}$ is of
the form (8.1) with $\tilde\omega_{k,k'}$ instead of $\omega_{k,k'}$, where $\tilde\omega \in \Omega$, then the $L^2$-norm of the difference is of the
form

\[ \|\tilde f_{jj'} - f_{jj'}\|^2 = \gamma_{jj'}^2 \sum_{k=0}^{2^j-1} \sum_{k'=0}^{2^{j'}-1} I(\tilde\omega_{k,k'} \ne \omega_{k,k'}) = \gamma_{jj'}^2\, \rho(\tilde\omega,\omega), \]

where $\rho(\tilde\omega,\omega) = \sum_{k=0}^{2^j-1} \sum_{k'=0}^{2^{j'}-1} I(\tilde\omega_{k,k'} \ne \omega_{k,k'})$ is the Hamming distance between the
binary sequences $\omega$ and $\tilde\omega$. In order to find a lower bound for the last expression, we apply
the Varshamov-Gilbert lower bound (see Tsybakov (2008), page 104) which states that
one can choose a subset $\Omega_1$ of $\Omega$, of cardinality at least $2^{N/8}$, such that $\rho(\tilde\omega,\omega) \ge N/8$ for
any distinct $\omega, \tilde\omega \in \Omega_1$. Hence, for any distinct $\omega, \tilde\omega \in \Omega_1$ one has $\|\tilde f_{jj'} - f_{jj'}\|^2 \ge \gamma_{jj'}^2 2^{j+j'}/8$. Note that the
Kullback divergence can be written as

\[ K(f,\tilde f) = (2\varepsilon^2)^{-1} \|(f - \tilde f) * g\|^2. \tag{8.2} \]

Since $|\omega_{k,k'} - \tilde\omega_{k,k'}| \le 1$, plugging $f$ and $\tilde f$ into (8.2), using Plancherel's formula and recalling
that $|\psi_{j,k,m}| \le 2^{-j/2}$, we derive

\[ K(f,\tilde f) \le (2\varepsilon^2)^{-1} \gamma_{jj'}^2\, 2^{-j} \sum_{k=0}^{2^j-1} \sum_{k'=0}^{2^{j'}-1} \sum_{m\in W_j} \int_0^1 \eta_{j'k'}^2(u)\, |g_m(u)|^2\,du. \]

Using (3.5), we obtain

\[ \sum_{m\in W_j} \int_0^1 \eta_{j'k'}^2(u)\, |g_m(u)|^2\,du \le C 2^{j(1-2\nu)}, \]

so that

\[ K(f,\tilde f) \le C \varepsilon^{-2} \gamma_{jj'}^2\, 2^{j+j'-2j\nu}. \tag{8.3} \]

Now, applying Lemma 6 with

\[ \delta^2 = \gamma_{jj'}^2 2^{j+j'}/32 = A^2 2^{-2s_1 j - 2s_2 j'}/32, \tag{8.4} \]
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one obtains constraint 2 3(. 2a i+ 2v + 1 ) i'(2s 2 +i) < Ce 2 /A 2 on j, j' and e where C is an 
absolute constant. Denote 

T £ = \og 2 {CA 2 e- 2 ). (8.5) 

Thus, we need to choose the combination of $j$ and $j'$ which solves the following optimization problem

$$2js_1+2j's_2 \Rightarrow \min, \qquad j(2s_1+2\nu+1)+j'(2s_2+1) \ge \tau_\varepsilon, \quad j,j' \ge 0. \tag{8.6}$$

It is easy to check that the solution of this linear constrained optimization problem is of the form $\{j,j'\} = \{(2s_1+2\nu+1)^{-1}\tau_\varepsilon, 0\}$ if $s_2(2\nu+1) \ge s_1$, and $\{j,j'\} = \{0,(2s_2+1)^{-1}\tau_\varepsilon\}$ if $s_2(2\nu+1) < s_1$. Plugging those values into (8.4), obtain

$$\delta^2 = \begin{cases} CA^2(\varepsilon^2/A^2)^{\frac{2s_2}{2s_2+1}}, & \text{if } s_1 > s_2(2\nu+1),\\[4pt] CA^2(\varepsilon^2/A^2)^{\frac{2s_1}{2s_1+2\nu+1}}, & \text{if } s_1 \le s_2(2\nu+1). \end{cases} \tag{8.7}$$
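
The solution of the linear program (8.6) can be checked with a short sketch. Since the objective has positive coefficients and there is a single linear constraint, the minimum is attained at one of the two points where the constraint line meets the coordinate axes, so it suffices to compare these two vertices. The function name and the numerical parameter values below are illustrative assumptions.

```python
def solve_resolution_lp(s1, s2, nu, tau):
    """Minimise 2*j*s1 + 2*jp*s2 subject to
    j*(2*s1 + 2*nu + 1) + jp*(2*s2 + 1) >= tau, j >= 0, jp >= 0,
    by comparing the two vertices of the feasible region on the axes."""
    vertex_j = (tau / (2 * s1 + 2 * nu + 1), 0.0)   # all resolution in j
    vertex_jp = (0.0, tau / (2 * s2 + 1))           # all resolution in j'

    def cost(v):
        return 2 * v[0] * s1 + 2 * v[1] * s2

    # On ties min() returns the first vertex, matching s2*(2*nu+1) >= s1.
    return min((vertex_j, vertex_jp), key=cost)


# s2*(2*nu+1) = 3 >= s1 = 1: the optimum puts all resolution into j.
print(solve_resolution_lp(s1=1.0, s2=1.0, nu=1.0, tau=10.0))
# s2*(2*nu+1) = 3 < s1 = 5: the optimum puts all resolution into j'.
print(solve_resolution_lp(s1=5.0, s2=1.0, nu=1.0, tau=10.0))
```

The comparison of the two vertex costs, $2s_1\tau_\varepsilon/(2s_1+2\nu+1)$ versus $2s_2\tau_\varepsilon/(2s_2+1)$, reduces to comparing $s_1$ with $s_2(2\nu+1)$, which is the dichotomy appearing in (8.7).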

The sparse-dense case. Let $\omega$ be the vector with components $\omega_{k'} \in \{0,1\}$. Denote by $\Omega$ the set of all possible $\omega$ and let the functions $f_{jj'}$ be of the form

$$f_{jj'}(t,u) = \gamma_{jj'}\sum_{k'=0}^{2^{j'}-1}\omega_{k'}\,\psi_{jk}(t)\,\eta_{j'k'}(u). \tag{8.8}$$

Note that the vector $\omega$ has $N = 2^{j'}$ components and, hence, the cardinality of $\Omega$ is $\mathrm{card}(\Omega) = 2^N$. Since $f_{jj'} \in B^{s_1,s_2}_{p,q}(A)$, direct calculations show that $\gamma_{jj'} \le A2^{-js_1'-j'(s_2+1/2)}$, so we choose $\gamma_{jj'} = A2^{-js_1'-j'(s_2+1/2)}$. If $\tilde f_{jj'}$ is of the form (8.8) with $\tilde\omega \in \Omega$ instead of $\omega$, then, calculating the $L^2$-norm of the difference similarly to the dense-dense case, obtain

$$\|\tilde f_{jj'}-f_{jj'}\|^2 = \gamma^2_{jj'}\sum_{k'=0}^{2^{j'}-1}1(\tilde\omega_{k'}\ne\omega_{k'}) \ge \gamma^2_{jj'}2^{j'}/8.$$

Similarly to the dense-dense case, using formulae (3.5) and (8.2), Plancherel's formula and $|\psi_{j,k,m}| \le 2^{-j/2}$, derive

$$K(f,\tilde f) \le (2\varepsilon^2)^{-1}\gamma^2_{jj'}\sum_{k'=0}^{2^{j'}-1}\sum_{m\in W_j}2^{-j}\int_0^1\eta^2_{j'k'}(u)\,g_m^2(u)\,du \le C(2\varepsilon^2)^{-1}\gamma^2_{jj'}\,2^{j'-2\nu j}.$$

Now, applying Lemma 6 with

$$\delta^2 = \gamma^2_{jj'}2^{j'}/32 = A^22^{-2s_1'j-2s_2j'}/32 \tag{8.9}$$

one obtains the constraint $2^{-j(2s_1'+2\nu)-j'(2s_2+1)} \le C\varepsilon^2/A^2$ on $j$, $j'$ and $\varepsilon$, where $C$ is an absolute constant. Thus, we need to choose the combination of $j$ and $j'$ which delivers the solution to the following linear optimization problem

$$2js_1'+2j's_2 \Rightarrow \min \quad \text{s.t.} \quad j(2s_1'+2\nu)+j'(2s_2+1) \ge \tau_\varepsilon, \quad j,j' \ge 0. \tag{8.10}$$

It is easy to check that the solution of this linear constrained optimization problem is of the form $\{j,j'\} = \{(2s_1'+2\nu)^{-1}\tau_\varepsilon, 0\}$ if $2\nu s_2 > s_1'$, and $\{j,j'\} = \{0,(2s_2+1)^{-1}\tau_\varepsilon\}$ if $2\nu s_2 \le s_1'$. Plugging those values into (8.9), obtain

$$\delta^2 = \begin{cases} CA^2(\varepsilon^2/A^2)^{\frac{2s_2}{2s_2+1}}, & \text{if } 2\nu s_2 \le s_1',\\[4pt] CA^2(\varepsilon^2/A^2)^{\frac{2s_1'}{2s_1'+2\nu}}, & \text{if } 2\nu s_2 > s_1'. \end{cases} \tag{8.11}$$

In order to complete the proof, recall expressions (3.7) and (3.8) for $d$.

8.2 Proofs of supplementary lemmas. 

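Proof of Lemma 1. Let us derive an expression for the upper bound of the variance of $\tilde\beta_{j,k,j',k'}$. Subtracting the expression for $\beta_{j,k,j',k'}$ from the expression for $\tilde\beta_{j,k,j',k'}$, we obtain

$$\tilde\beta_{j,k,j',k'}-\beta_{j,k,j',k'} = \varepsilon\sum_{m\in W_j}\psi_{j,k,m}\int_0^1\frac{z_m(u)}{g_m(u)}\,\eta_{j',k'}(u)\,du. \tag{8.12}$$
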

Now, before we proceed to the derivation of the upper bound of the variance, let us first state a result that will be used in our calculation. Recall from stochastic calculus that for any function $F(t,u) \in L^2([0,1]\times[0,1])$, one has

$$E\left[\int_0^1\int_0^1F(t,u)\,dz(t,u)\right]^2 = \int_0^1\int_0^1F^2(t,u)\,dt\,du. \tag{8.13}$$
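
Relation (8.13) can be checked by simulation. The sketch below is an illustration only, under the following assumptions: the square $[0,1]^2$ is discretized into an $8\times 8$ grid, the white-noise increment over a cell of area $h^2$ is modelled as an independent $N(0,h^2)$ variable, and the test function $F(t,u)=\cos(2\pi t)(1+u)$ is an arbitrary choice for which $\int\!\!\int F^2 = 7/6$.

```python
import math
import random


def F(t, u):
    """Illustrative integrand (assumption): cos(2*pi*t) * (1 + u)."""
    return math.cos(2 * math.pi * t) * (1 + u)


def simulate(n_grid=8, n_rep=10000, seed=0):
    """Return (Monte Carlo second moment of the discretized stochastic
    integral, Riemann sum of F^2 over the same grid)."""
    rng = random.Random(seed)
    h = 1.0 / n_grid
    mids = [(i + 0.5) * h for i in range(n_grid)]
    weights = [F(t, u) for t in mids for u in mids]
    riemann = sum(w * w for w in weights) * h * h
    second_moment = 0.0
    for _ in range(n_rep):
        # Each cell contributes F(midpoint) times a N(0, h^2) increment.
        integral = sum(w * rng.gauss(0.0, h) for w in weights)
        second_moment += integral * integral
    return second_moment / n_rep, riemann


mc, riemann = simulate()
print(mc, riemann, 7.0 / 6.0)
```

The Monte Carlo value fluctuates around the Riemann sum, which in turn approximates the exact integral $7/6$.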



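Hence, recalling that $z_m(u) = \int z(u,t)e_m(t)\,dt$, choosing

$$F(t,u) = \eta_{j',k'}(u)\sum_{m\in W_j}\psi_{j,k,m}\,\frac{e_m(t)}{g_m(u)},$$

squaring both sides of (8.12), taking expectation and using the relation (8.13), we obtain

$$\begin{aligned}
\mathrm{Var}(\tilde\beta_{j,k,j',k'}) &= \varepsilon^2E\left[\int_0^1\int_0^1\eta_{j',k'}(u)\sum_{m\in W_j}\psi_{j,k,m}\,\frac{e_m(t)}{g_m(u)}\,dz(u,t)\right]^2\\
&= \varepsilon^2\sum_{m,m'\in W_j}\psi_{j,k,m}\overline{\psi_{j,k,m'}}\int_0^1\int_0^1\frac{e_m(t)\,\overline{e_{m'}(t)}}{g_m(u)\,\overline{g_{m'}(u)}}\,\eta^2_{j',k'}(u)\,dt\,du\\
&= \varepsilon^2\sum_{m\in W_j}|\psi_{j,k,m}|^2\int_0^1\frac{|\eta_{j',k'}(u)|^2}{|g_m(u)|^2}\,du,
\end{aligned}$$

since in the double summation above, all terms involving $m \ne m'$ vanish due to $\int_0^1e_m(t)\overline{e_{m'}(t)}\,dt = 0$. Consequently, taking into account (2.4), (3.5) and the fact that $|\psi_{j,k,m}| \le 2^{-j/2}$, obtain

$$\mathrm{Var}(\tilde\beta_{j,k,j',k'}) \asymp \varepsilon^2\sum_{m\in W_j}|\psi_{j,k,m}|^2|m|^{2\nu}\int_0^1\eta^2_{j',k'}(u)\,du \asymp \varepsilon^22^{2j\nu}, \tag{8.14}$$

so that the assertion of the lemma holds.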

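Proof of Lemma 2. First note that, under assumption (3.4), one has

$$\sum_{k,k'}|\beta_{j,k,j',k'}|^p \le A^p2^{-p[(js_1+j's_2)+(1/2-1/p)(j+j')]}.$$

If $p < 2$, one has $p' = p$, $s_i' = s_i+1/2-1/p$, $i = 1,2$, and

$$\sum_{k,k'}\beta^2_{j,k,j',k'} \le \Big[\sum_{k,k'}|\beta_{j,k,j',k'}|^p\Big]^{(2-p)/p}\sum_{k,k'}|\beta_{j,k,j',k'}|^p \le A^22^{-2(js_1'+j's_2')}.$$

If $p \ge 2$, then $p' = 2$, $s_i' = s_i$, $i = 1,2$, and, applying the Cauchy-Schwarz inequality, one obtains

$$\sum_{k,k'}\beta^2_{j,k,j',k'} \le \Big(\sum_{k,k'}|\beta_{j,k,j',k'}|^p\Big)^{2/p}\Big(\sum_{k,k'}1\Big)^{1-2/p} \le A^22^{-2(js_1+j's_2)},$$

which completes the proof.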

Proof of Lemma 3. Observe that $\tilde\beta_{j,k,j',k'}-\beta_{j,k,j',k'}$ is a zero-mean Gaussian random variable with variance given by (8.14), so that

$$\mathrm{Var}(\tilde\beta_{j,k,j',k'}) \le \sigma_0^2\,\varepsilon^22^{2j\nu}. \tag{8.15}$$

Denoting $\bar\Phi(x) = 1-\Phi(x)$, where $\Phi(x)$ is the standard normal c.d.f., and recalling that $\bar\Phi(x) \le (x\sqrt{2\pi})^{-1}\exp(-x^2/2)$ if $x > 0$, we derive

Pr (Sljkj'Hfit) = Pr [\£j,k,f,k>\ > aXje) = 2$ (aA ie (aoe2^) _1 ) 

< 2$ (aC^o^VRiM) < — 2(70 , . , e~^l 



o 

which completes the proof. 
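
The Gaussian tail inequality $1-\Phi(x) \le (x\sqrt{2\pi})^{-1}\exp(-x^2/2)$, $x>0$, used in the proof above, can be verified numerically. The following is a minimal sketch using only the standard library; the helper names and the grid of test points are illustrative assumptions.

```python
import math


def normal_tail(x):
    """Upper tail 1 - Phi(x) of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def tail_bound(x):
    """Upper bound exp(-x^2/2) / (x * sqrt(2*pi)), valid for x > 0."""
    return math.exp(-x * x / 2.0) / (x * math.sqrt(2.0 * math.pi))


for x in (0.5, 1.0, 2.0, 4.0):
    print(x, normal_tail(x), tail_bound(x))
```

The bound is loose for small $x$ and becomes sharp as $x$ grows, which is the regime relevant for thresholds of order $\sqrt{\ln(1/\varepsilon)}$.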

8.3 Proof of upper bounds for the risk. 
Proof of Theorem 2. Denote

$$\chi_{\varepsilon,A} = A^{-2}\varepsilon^2\ln(1/\varepsilon), \tag{8.16}$$

$$2^{j_0} = (\chi_{\varepsilon,A})^{-\frac{d}{2s_1'}}, \qquad 2^{j_0'} = (\chi_{\varepsilon,A})^{-\frac{d}{2s_2'}}, \tag{8.17}$$

and observe that, with $J$ and $J'$ given by (4.3), the estimation error can be decomposed into the sum of four components as follows

$$E\|\hat f_n-f\|^2 \le \sum_{j,k,j',k'}E|\hat\beta_{j,k,j',k'}-\beta_{j,k,j',k'}|^2 \le R_1+R_2+R_3+R_4, \tag{8.18}$$

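where

$$\begin{aligned}
R_1 &= \sum_{k=0}^{2^{m_0}-1}\sum_{k'=0}^{2^{m_0'}-1}\mathrm{Var}(\tilde\beta_{m_0,k,m_0',k'}),\\
R_2 &= \sum_{j=m_0}^{J-1}\sum_{j'=m_0'}^{J'-1}\sum_{k,k'}E\left[|\tilde\beta_{j,k,j',k'}-\beta_{j,k,j',k'}|^2\,1\big(|\tilde\beta_{j,k,j',k'}| > \lambda_{j\varepsilon}\big)\right],\\
R_3 &= \sum_{j=m_0}^{J-1}\sum_{j'=m_0'}^{J'-1}\sum_{k,k'}|\beta_{j,k,j',k'}|^2\Pr\big(|\tilde\beta_{j,k,j',k'}| < \lambda_{j\varepsilon}\big),\\
R_4 &= \Big(\sum_{j=J}^{\infty}\sum_{j'=m_0'}^{J'-1}+\sum_{j=m_0}^{J-1}\sum_{j'=J'}^{\infty}+\sum_{j=J}^{\infty}\sum_{j'=J'}^{\infty}\Big)\sum_{k,k'}|\beta_{j,k,j',k'}|^2.
\end{aligned}$$
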

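For $R_1$, using (4.1), derive, as $\varepsilon \to 0$,

$$R_1 \le C\varepsilon^2 = O\big(A^2\chi^d_{\varepsilon,A}\big). \tag{8.19}$$

To calculate $R_4$, we apply Lemma 2 and use (4.3), obtaining, as $\varepsilon \to 0$,

$$\begin{aligned}
R_4 &= O\Big(\Big(\sum_{j\ge J}\sum_{j'\ge m_0'}+\sum_{j\ge m_0}\sum_{j'\ge J'}\Big)A^22^{-2js_1'-2j's_2'}\Big) = O\big(A^22^{-2Js_1'}+A^22^{-2J's_2'}\big)\\
&= O\Big(A^2(\varepsilon^2)^{\frac{2s_1'}{2\nu+1}}+A^2(\varepsilon^2)^{2s_2'}\Big) = O\big(A^2\chi^d_{\varepsilon,A}\big).
\end{aligned} \tag{8.20}$$
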

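Then, our objective is to prove that, as $\varepsilon \to 0$, one has $R_i = O\big(A^2\chi^d_{\varepsilon,A}[\ln(1/\varepsilon)]^{d_1}\big)$, $i = 2,3$.

Now, note that each of $R_2$ and $R_3$ can be partitioned into the sum of two errors as follows

$$R_2 \le R_{21}+R_{22}, \qquad R_3 \le R_{31}+R_{32}, \tag{8.21}$$

where

$$R_{21} = \sum_{j=m_0}^{J-1}\sum_{j'=m_0'}^{J'-1}\sum_{k,k'}E\left[|\tilde\beta_{j,k,j',k'}-\beta_{j,k,j',k'}|^2\,1\big(|\tilde\beta_{j,k,j',k'}-\beta_{j,k,j',k'}| > \tfrac12\lambda_{j\varepsilon}\big)\right], \tag{8.22}$$

$$R_{22} = \sum_{j=m_0}^{J-1}\sum_{j'=m_0'}^{J'-1}\sum_{k,k'}E\left[|\tilde\beta_{j,k,j',k'}-\beta_{j,k,j',k'}|^2\,1\big(|\beta_{j,k,j',k'}| > \tfrac12\lambda_{j\varepsilon}\big)\right], \tag{8.23}$$

$$R_{31} = \sum_{j=m_0}^{J-1}\sum_{j'=m_0'}^{J'-1}\sum_{k,k'}|\beta_{j,k,j',k'}|^2\Pr\big(|\tilde\beta_{j,k,j',k'}-\beta_{j,k,j',k'}| > \tfrac12\lambda_{j\varepsilon}\big), \tag{8.24}$$

$$R_{32} = \sum_{j=m_0}^{J-1}\sum_{j'=m_0'}^{J'-1}\sum_{k,k'}|\beta_{j,k,j',k'}|^2\,1\big(|\beta_{j,k,j',k'}| < \tfrac32\lambda_{j\varepsilon}\big). \tag{8.25}$$

Combining (8.22) and (8.24), and applying the Cauchy-Schwarz inequality and Lemma 3 with $\alpha = 1/2$, one derives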



J-i J'-i 



R 21 + R 31 = O I Y 2i+i '^ 16CT o [l n (l/e)]-i yf^¥~i 



O I 2 J ^ +1 ) 2 3J '/ 2 ( £ ) + ^l I = O I (e 



-p 3 

2^ 32aj> 2 



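Hence, due to condition (4.6), one has, as $\varepsilon \to 0$,

$$R_{21}+R_{31} \le C\varepsilon^2 = O\big(A^2\chi^d_{\varepsilon,A}\big). \tag{8.26}$$

For the sum of $R_{22}$ and $R_{32}$, using (4.1) and (4.2), we obtain

$$\Delta = R_{22}+R_{32} = O\Big(\sum_{j=m_0}^{J-1}\sum_{j'=m_0'}^{J'-1}\sum_{k,k'}\min\big\{\beta^2_{j,k,j',k'},\ \varepsilon^2\ln(1/\varepsilon)\,2^{2j\nu}\big\}\Big). \tag{8.27}$$

Then, $\Delta$ can be partitioned into the sum of three components $\Delta_1$, $\Delta_2$ and $\Delta_3$ according to three different sets of indices:

$$\Delta_1 = O\Big(\Big(\sum_{j=j_0+1}^{J-1}\sum_{j'=m_0'}^{J'-1}+\sum_{j=m_0}^{J-1}\sum_{j'=j_0'+1}^{J'-1}\Big)A^22^{-2js_1'-2j's_2'}\Big), \tag{8.28}$$

$$\Delta_2 = O\Big(\sum_{j=m_0}^{j_0}\sum_{j'=m_0'}^{j_0'}\varepsilon^2\ln(1/\varepsilon)\,2^{j(2\nu+1)+j'}\,1\big(2^{j(2\nu+1)+j'} \le \chi^{d-1}_{\varepsilon,A}\big)\Big), \tag{8.29}$$

$$\Delta_3 = O\Big(\sum_{j=m_0}^{j_0}\sum_{j'=m_0'}^{j_0'}A^{p'}2^{-p'js_1'-p'j's_2'}\big(\varepsilon^2\ln(1/\varepsilon)\,2^{2j\nu}\big)^{1-p'/2}\,1\big(2^{j(2\nu+1)+j'} > \chi^{d-1}_{\varepsilon,A}\big)\Big), \tag{8.30}$$

where $d$ is defined in (3.7). It is easy to see that for $\Delta_1$ given in (8.28) and $j_0$ and $j_0'$ given by (8.17), as $\varepsilon \to 0$, one has

$$\Delta_1 = O\big(A^2\chi^d_{\varepsilon,A}\big). \tag{8.31}$$

For $\Delta_2$ defined in (8.29), obtain

$$\Delta_2 = O\big(\varepsilon^2\ln(1/\varepsilon)\,\chi^{d-1}_{\varepsilon,A}\big) = O\big(A^2\chi^d_{\varepsilon,A}\big), \quad \varepsilon \to 0. \tag{8.32}$$
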

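In order to construct upper bounds for $\Delta_3$ in (8.30), we need to consider three different cases.

Case 1: $s_1 \ge s_2(2\nu+1)$. In this case, $d = 2s_2/(2s_2+1)$ and

$$\begin{aligned}
\Delta_3 &\le CA^2(\chi_{\varepsilon,A})^{1-p'/2}\sum_{j=m_0}^{j_0}2^{-j[p's_1'-2\nu(1-p'/2)]}\sum_{j'=m_0'}^{j_0'}2^{-p'j's_2'}\,1\big(2^{j'} > \chi^{d-1}_{\varepsilon,A}2^{-j(2\nu+1)}\big)\\
&\le CA^2(\chi_{\varepsilon,A})^{(1-p'/2)+p's_2'(1-d)}\sum_{j=m_0}^{j_0}2^{-j[p's_1'-2\nu(1-p'/2)-p'(2\nu+1)s_2']}\\
&= CA^2\chi^d_{\varepsilon,A}\sum_{j=m_0}^{j_0}2^{-j[p's_1-p's_2(2\nu+1)]},
\end{aligned}$$

so that, as $\varepsilon \to 0$,

$$\Delta_3 = O\Big(A^2\chi^d_{\varepsilon,A}[\ln(1/\varepsilon)]^{1(s_1 = s_2(2\nu+1))}\Big). \tag{8.33}$$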



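Case 2: $(1/p-1/2)(2\nu+1) < s_1 < s_2(2\nu+1)$. In this case, $d = 2s_1/(2s_1+2\nu+1)$ and

$$\begin{aligned}
\Delta_3 &\le CA^2(\chi_{\varepsilon,A})^{1-p'/2}\sum_{j'=m_0'}^{j_0'}2^{-p'j's_2'}\sum_{j=m_0}^{j_0}2^{-j[p's_1'-2\nu(1-p'/2)]}\,1\big(2^{j(2\nu+1)} > \chi^{d-1}_{\varepsilon,A}2^{-j'}\big)\\
&\le CA^2(\chi_{\varepsilon,A})^{(1-p'/2)+\frac{p'(1-d)}{2\nu+1}\left(s_1-(2\nu+1)(1/p'-1/2)\right)}\sum_{j'=m_0'}^{j_0'}2^{-j'p'[s_2'-s_1/(2\nu+1)+(1/p'-1/2)]}\\
&= CA^2\chi^d_{\varepsilon,A}\sum_{j'=m_0'}^{j_0'}2^{-j'p'[s_2-s_1/(2\nu+1)]},
\end{aligned}$$

so that, as $\varepsilon \to 0$,

$$\Delta_3 = O\big(A^2\chi^d_{\varepsilon,A}\big). \tag{8.34}$$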
Case 3: $s_1 \le (1/p-1/2)(2\nu+1)$. In this case, $d = 2s_1'/(2s_1'+2\nu)$ and $p < 2$. Then, since $ps_1'-2\nu(1-p/2) = p\,[s_1-(1/p-1/2)(2\nu+1)] \le 0$, one has

$$\begin{aligned}
\Delta_3 &\le CA^2(\chi_{\varepsilon,A})^{1-p'/2}\sum_{j=m_0}^{j_0}2^{-j[ps_1'-2\nu(1-p/2)]}\\
&\le CA^2(\chi_{\varepsilon,A})^{1-p'/2}\,2^{j_0p[(1/p-1/2)(2\nu+1)-s_1]}\,[\ln(1/\varepsilon)]^{1(s_1 = (1/p-1/2)(2\nu+1))}.
\end{aligned}$$

Plugging in $j_0$ of the form (8.17), obtain, as $\varepsilon \to 0$,

$$\Delta_3 = O\Big(A^2\chi^d_{\varepsilon,A}[\ln(1/\varepsilon)]^{1(s_1 = (1/p-1/2)(2\nu+1))}\Big). \tag{8.35}$$

Now, to complete the proof, combine formulae (8.18)-(8.35).
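
The three cases above determine the exponent $d$ in the rate $A^2\chi^d_{\varepsilon,A}$. As a minimal sketch (the function name and the numerical inputs are illustrative assumptions, and logarithmic factors are ignored), the case analysis can be written as:

```python
def rate_exponent(s1, s2, nu, p):
    """Exponent d from Cases 1-3 of the proof, with p' = min(p, 2)
    and s1' = s1 + 1/2 - 1/p'."""
    p_prime = min(p, 2.0)
    s1_prime = s1 + 0.5 - 1.0 / p_prime
    if s1 >= s2 * (2 * nu + 1):                      # Case 1
        return 2 * s2 / (2 * s2 + 1)
    if s1 > (1.0 / p - 0.5) * (2 * nu + 1):          # Case 2
        return 2 * s1 / (2 * s1 + 2 * nu + 1)
    return 2 * s1_prime / (2 * s1_prime + 2 * nu)    # Case 3 (requires p < 2)


print(rate_exponent(s1=5.0, s2=1.0, nu=1.0, p=2.0))   # Case 1
print(rate_exponent(s1=1.0, s2=1.0, nu=1.0, p=2.0))   # Case 2
print(rate_exponent(s1=1.2, s2=1.0, nu=1.0, p=1.0))   # Case 3
```

The three expressions agree on the boundaries between cases, so $d$ is a continuous function of $s_1$; only the power of the logarithmic factor changes there.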

8.4 Proofs of the statements in Section 5.

Proof of Lemma 4. Subtracting $\beta_{j,k,j',k'}$ from (5.1), one obtains

o M z 

/3j,k,j',k' — Pj,k,j',k' = T7 E ^j,k,m E —7 — r Vj',k'(ui). (8.36) 

m&Wj 1=1 gm ^ Ul > 

where $z_m(u_l) = y_m(u_l)-h_m(u_l)$. Since the Fourier transform is an orthogonal transform, one has $E[z_{m_1}(u_{l_1})z_{m_2}(u_{l_2})] = 0$ if $l_1 \ne l_2$ and $E[z_{m_1}(u_l)z_{m_2}(u_l)] = 0$ if $m_1 \ne m_2$.

Therefore,

2 M 1 

Var(^ )fej/>fc ,) = IV^J'E \ a ( u ,)\2 \Vj>,k'(ui)\ 2 

m&Wj 1=1 |ymV Ul 

a 2 2 2 ^ ^ . , |2 1 A, . . |2 a 2 2 2 ^ 

which completes the proof. 

Proof of Lemma 5. Recall that

/cm) = E ^ fy,k,?,k'' i i ) j,k( t ) r )f,k'( u ) and /*(*) =^2 b j,k' { pj,k(t)vj>,k>(ui), 

j,k j',k' j,k 

so that 

00 

where the set $K_{j'} = \{k' : \eta(2^{j'}u_l-k') \ne 0\}$ is finite for any $j'$ due to the finite support of $\eta$. Thus, since $p \ge 1$, for any $\delta > 0$, one has

2 

£ £ |/W,^'' M/2 2-^/ 2 
f=o k'eKx 



2 J -1 2 J -1 

Ei & Si p * C E 

fc=0 fc=0 



2J-1 / 00 \ / 00 _f X '' 1 

* C E E E iftwi^' 1 ^ E E (T" a/2V 

k=o \j'=o fc'e/fj / V/'=o fc'e-fQ
Then, for any $q \ge 1$, one has



q/p 



k,k> 



If q/p > 1, then, using Cauchy-Schwarz inequality again, it is straightforward to verify 
that 



Hence, 



'2^-1 



B S <C S Y, 

j'=o 



g/p 



'j,k,j',k'\ 



k,k' 



g/p 



)'(l+25)g/2 



j'=0 



k=0 



j'=0 



E1/ 3 . 



'j,k,j',k'\ 



k,k' 



q/p 



< C s A q = A q 



provided $s_2^* \ge (1+2\delta)/2$. Since $s_2 > \max\{1/2, 1/p\}$ implies $s_2^* > 1/2$, choose $\delta = (s_2^*-1/2)/2$. If $q/p < 1$, then similar considerations yield



B j <CsY, 

i'=o 



^\Pj,k,j',k'\ P 



k.k 1 



g/p 



2 j'(l+S)q/2 



so that the previous calculation holds with $\delta$ instead of $2\delta$, and the proof is complete.

8.5 Proofs of the statements in Section 6.

Proof of Theorem 4. Repeating the proof of Theorem 1 with $j'$ and $k'$ replaced by the vectors $\mathbf{j}'$ and $\mathbf{k}'$, respectively, and $s_2j'$ replaced by $\mathbf{j}'^T\mathbf{s}_2$, we again arrive at two cases. Denote the $r$-dimensional vector with all unit components by $\mathbf{e}$.

In the dense-dense case, we use an $(r+1)$-dimensional array $\omega$, so that $N = 2^{j+\mathbf{e}^T\mathbf{j}'}$. Choose $\gamma^2_{j,\mathbf{j}'} = A^22^{-j(2s_1+1)-\mathbf{j}'^T(2\mathbf{s}_2+\mathbf{e})}$ and observe that $K(f,\tilde f) \le C\varepsilon^{-2}\gamma^2_{j,\mathbf{j}'}2^{j+\mathbf{e}^T\mathbf{j}'}2^{-2\nu j}$. Now, applying Lemma 6 with

$$\delta^2 = \gamma^2_{j,\mathbf{j}'}2^{j+\mathbf{e}^T\mathbf{j}'}/32 = A^22^{-2s_1j-2\mathbf{j}'^T\mathbf{s}_2}/32 \tag{8.37}$$

one arrives at the following optimization problem

$$2js_1+2\mathbf{j}'^T\mathbf{s}_2 \Rightarrow \min, \qquad j(2s_1+2\nu+1)+\sum_{l=1}^r(2s_{2,l}+1)j_l' \ge \tau_\varepsilon, \quad j, j_l' \ge 0, \tag{8.38}$$

where $\tau_\varepsilon$ is defined in formula (8.5). Setting $j = \tau_\varepsilon/(2s_1+2\nu+1)-\sum_{l=1}^r(2s_{2,l}+1)j_l'/(2s_1+2\nu+1)$, arrive at the optimization problem

$$\frac{2s_1\tau_\varepsilon}{2s_1+2\nu+1}+\sum_{l=1}^r\frac{2[(2\nu+1)s_{2,l}-s_1]\,j_l'}{2s_1+2\nu+1} \Rightarrow \min, \qquad j_l' \ge 0, \quad l = 1,\cdots,r. \tag{8.39}$$

If $s_{2,l_0}(2\nu+1) \ge s_1$, then each $j_l'$ is multiplied by a nonnegative number and the minimum is attained when $j_l' = 0$, $l = 1,\cdots,r$. Then, $j = \tau_\varepsilon/(2s_1+2\nu+1)$. On the other hand, if $s_{2,l_0}(2\nu+1) < s_1$, then $j'_{l_0}$ is multiplied by the smallest factor, which is negative. Therefore, the minimum in (8.39) is attained if $j = 0$, $j_l' = 0$, $l \ne l_0$, and $j'_{l_0} = \tau_\varepsilon/(2s_{2,l_0}+1)$. Plugging those values into (8.37), obtain

$$\delta^2 = \begin{cases} CA^2(\varepsilon^2/A^2)^{\frac{2s_{2,l_0}}{2s_{2,l_0}+1}}, & \text{if } s_1 > s_{2,l_0}(2\nu+1),\\[4pt] CA^2(\varepsilon^2/A^2)^{\frac{2s_1}{2s_1+2\nu+1}}, & \text{if } s_1 \le s_{2,l_0}(2\nu+1). \end{cases} \tag{8.40}$$

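In the sparse-dense case, we use an $r$-dimensional array $\omega$, so that $N = 2^{\mathbf{e}^T\mathbf{j}'}$. Choose $\gamma^2_{j,\mathbf{j}'} = A^22^{-2js_1^*-\mathbf{j}'^T(2\mathbf{s}_2+\mathbf{e})}$ and observe that $K(f,\tilde f) \le C\varepsilon^{-2}\gamma^2_{j,\mathbf{j}'}2^{\mathbf{e}^T\mathbf{j}'}2^{-2\nu j}$. Now, applying Lemma 6 with

$$\delta^2 = A^22^{-2s_1^*j-2\mathbf{j}'^T\mathbf{s}_2}/32 \tag{8.41}$$

one arrives at the following optimization problem

$$2js_1^*+2\mathbf{j}'^T\mathbf{s}_2 \Rightarrow \min, \qquad j(2s_1^*+2\nu)+\sum_{l=1}^r(2s_{2,l}+1)j_l' \ge \tau_\varepsilon, \quad j, j_l' \ge 0. \tag{8.42}$$

Again, setting $j = \tau_\varepsilon/(2s_1^*+2\nu)-\sum_{l=1}^r(2s_{2,l}+1)j_l'/(2s_1^*+2\nu)$, arrive at the optimization problem

$$\frac{2s_1^*\tau_\varepsilon}{2s_1^*+2\nu}+\sum_{l=1}^r\frac{2j_l'\,[2\nu s_{2,l}-s_1^*]}{2s_1^*+2\nu} \Rightarrow \min, \qquad j_l' \ge 0, \quad l = 1,\cdots,r. \tag{8.43}$$

Repeating the reasoning applied in the dense-dense case, we obtain $j = 0$, $j_l' = 0$, $l \ne l_0$, and $j'_{l_0} = \tau_\varepsilon/(2s_{2,l_0}+1)$ if $2\nu s_{2,l_0} < s_1^*$, and $j = \tau_\varepsilon/(2s_1^*+2\nu)$, $j_l' = 0$, $l = 1,\cdots,r$, if $2\nu s_{2,l_0} \ge s_1^*$. Plugging those values into (8.41), obtain

$$\delta^2 = \begin{cases} CA^2(\varepsilon^2/A^2)^{\frac{2s_{2,l_0}}{2s_{2,l_0}+1}}, & \text{if } 2\nu s_{2,l_0} < s_1^*,\\[4pt] CA^2(\varepsilon^2/A^2)^{\frac{2s_1^*}{2s_1^*+2\nu}}, & \text{if } 2\nu s_{2,l_0} \ge s_1^*. \end{cases} \tag{8.44}$$

In order to complete the proof, combine (8.40) and (8.44) and note that $s_1^* = s_1'$ if $p \le 2$.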

Proof of Theorem 5. Repeat the proof of Theorem 2 with $j'$ and $k'$ replaced by the vectors $\mathbf{j}'$ and $\mathbf{k}'$, respectively, $s_2'j'$ replaced by $\mathbf{j}'^T\mathbf{s}_2'$ and

$$2^{j_0} = (\chi_{\varepsilon,A})^{-\frac{d}{2s_1'}}, \qquad 2^{j'_{0,l}} = (\chi_{\varepsilon,A})^{-\frac{d}{2s'_{2,l}}}, \quad l = 1,\cdots,r.$$

Then, formulae (8.18)-(8.26) are valid. One can also partition $\Delta$ in (8.27) into $\Delta_1$, $\Delta_2$ and $\Delta_3$ given by expressions similar to (8.28), (8.29) and (8.30), with $r+1$ sums in (8.28) instead of two, $\sum_{j'=m_0'}^{j_0'}$ replaced by $r$ respective sums, and $1\big(2^{j(2\nu+1)+j'} > \chi^{d-1}_{\varepsilon,A}\big)$ replaced by $1\big(2^{j(2\nu+1)+\mathbf{e}^T\mathbf{j}'} > \chi^{d-1}_{\varepsilon,A}\big)$. Then, upper bounds (8.31) and (8.32) hold. In order to construct upper bounds for $\Delta_3$, we again need to consider three different cases.

In Case 1, $s_1 \ge s_{2,l_0}(2\nu+1)$, replace the sum over $j'$ by the sum over $j'_{l_0}$, and the sum over $j$ by the sum over $j, j'_1,\cdots,j'_{l_0-1},j'_{l_0+1},\cdots,j'_r$. Repeating the calculations for this case, keeping in mind that $s'_{2,l} \ge s'_{2,l_0}$ for any $l$ and noting that, whenever $s'_{2,l} = s'_{2,l_0}$, we gain an extra logarithmic factor, we arrive at

$$\Delta_3 = O\Big(A^2\chi^d_{\varepsilon,A}[\ln(1/\varepsilon)]^{1(s_1 = s_{2,l_0}(2\nu+1))+\sum_{l\ne l_0}1(s_{2,l} = s_{2,l_0})}\Big). \tag{8.45}$$

In Case 2, $(1/p-1/2)(2\nu+1) < s_1 < s_{2,l_0}(2\nu+1)$, replace the sum over $j'$ by the sum over the vector $\mathbf{j}'$ with components ranging up to $\mathbf{j}_0' = (j'_{0,1},\cdots,j'_{0,r})$, and arrive at (8.34). In Case 3, $s_1 \le (1/p-1/2)(2\nu+1)$, since the sum over $\mathbf{j}'$ is uniformly bounded, the calculations for the two-dimensional case hold and (8.35) is valid. Combination of (8.45), (8.34) and (8.35) completes the proof.